Detecting and Correcting Contamination in Genetic Data by

نویسنده

  • Matthew Flickinger
چکیده

DNA sample contamination is a serious problem in DNA sequencing studies, and may result in systematic genotype misclassification and false positive associations. While methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify withinspecies DNA sample contamination based on (1) a combination of sequencing reads and arraybased genotype data; (2) sequence reads alone; and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of insilico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy, and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. A This work has been published: Jun, G., Flickinger, M., Hetrick, K.N., Romm, J.M., Doheny, K.F., Abecasis, G.R., Boehnke, M., Kang, H.M. (2012). Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data. Am J Hum Genet 91, 839–848.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Development of RadRob15, A Robot for Detecting Radioactive Contamination in Nuclear Medicine Departments

Accidental or intentional release of radioactive materials into the living or working environment may cause radioactive contamination. In nuclear medicine departments, radioactive contamination is usually due to radionuclides which emit high energy gamma photons and particles. These radionuclides have a broad range of energies and penetration capabilities. Rapid detection of radioactive contami...

متن کامل

An approach to fault detection and correction in design of systems using of Turbo ‎codes‎

We present an approach to design of fault tolerant computing systems. In this paper, a technique is employed that enable the combination of several codes, in order to obtain flexibility in the design of error correcting codes. Code combining techniques are very effective, which one of these codes are turbo codes. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...

متن کامل

Optimal Detection of Oil Contamination at Sea by the FPSO Algorithm

Leakage of oil from pipelines and oil tankers into seas and oceans is ecologically important and can have significant social and economic impacts on the environment. An early detection of deliberate or accidental oil spills can reduce serious hazards that may threaten coastal residents and help identify pollutants. Iran has been surrounded by seas from the north and the south and they provide u...

متن کامل

A hybrid model based on machine learning and genetic algorithm for detecting fraud in financial statements

Financial statement fraud has increasingly become a serious problem for business, government, and investors. In fact, this threatens the reliability of capital markets, corporate heads, and even the audit profession. Auditors in particular face their apparent inability to detect large-scale fraud, and there are various ways to identify this problem. In order to identify this problem, the majori...

متن کامل

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data, often, modeled using vector autoregressive moving average (VARMA) model. But presence of outliers can violates the stationary assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016